Update shapefile.py #125

FourSpaces · 2017-10-28T10:53:11Z

Add a __cuttingStr function that truncates non-English UTF-8 strings to a specified length in bytes. To preserve the integrity of each UTF-8 character, the cut never falls in the middle of a character's encoding.

Example:
w = shapefile1.Writer()
w.shapeType = 1
w.autoBalance = 1
w.field('TEXT', 'C', size=10)
w.field('SHORT_TEXT', 'C', size=10)
w.point(121.45291, 31.27055)
w.record('Hello1', '中国的汉字1')
w.record('Hello1', '中国的汉字2')
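
For readers who want to see the idea in isolation, here is a minimal sketch of byte-limited truncation that respects UTF-8 character boundaries. It is not the code from this PR: the name truncate_utf8 is hypothetical, and the sketch assumes Python 3, where indexing a bytes object yields an integer.

```python
def truncate_utf8(text, size):
    """Encode text as UTF-8 and keep at most `size` bytes without splitting a character.

    Hypothetical helper for illustration; not part of PyShp.
    """
    data = text.encode("utf-8")
    if len(data) <= size:
        return data
    end = size
    # UTF-8 continuation bytes have the bit pattern 10xxxxxx. If the first
    # dropped byte is a continuation byte, the cut is inside a character,
    # so step back until the cut sits on the start of a character.
    while end > 0 and (data[end] & 0xC0) == 0x80:
        end -= 1
    return data[:end]


short = truncate_utf8("Hello1", 10)      # fits, returned unchanged
clipped = truncate_utf8("中国的汉字1", 10)  # 16 bytes encoded, cut back to 9
print(short, clipped.decode("utf-8"))
```

With a field size of 10, the second value keeps only the first three characters (9 bytes), because the fourth character would need bytes 10 to 12.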


karimbahgat · 2018-06-06T17:44:20Z

This is a good point. I guess currently there is a danger of cutting short and invalidating any utf8 characters consisting of multiple bytes at the end of a string when truncating text values.
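
To make the danger concrete, this small sketch slices an encoded value at a fixed byte count, the way a naive truncation would, and checks whether the result still decodes. The 10-byte limit and the sample string come from the example in this PR; everything else is illustrative and assumes Python 3.

```python
value = "中国的汉字1"
raw = value.encode("utf-8")[:10]  # naive cut at 10 bytes, ignoring character boundaries

try:
    raw.decode("utf-8")
    valid = True
except UnicodeDecodeError:
    # The tenth byte is the lead byte of a three-byte character whose
    # remaining two bytes were cut off, so the data is no longer valid UTF-8.
    valid = False

print(len(raw), valid)
```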

So I really welcome this addition, but would make two small requests for changes:

1. I see this is based on v1.2, not the more recent 2.0. The most recent version first sends the value off to be encoded in b(), and I wonder if this could be implemented inside that function, given an optional 'size' arg for how many bytes to truncate the text to. That would keep things grouped and avoid creating a new special method.
2. Add it without all the formatting edits throughout the script. Although great, they make it difficult to see what pertains to the utf8 handling, and I'm not sure whether any errors might have crept in there. Those would be better as a separate PR.
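
One way the first request could look is sketched below. This is only an assumption about the shape of such a change, not PyShp's actual b() function: the name encode_field and its parameters are hypothetical. It encodes first, then drops any incomplete trailing byte sequence by decoding with errors="ignore" and re-encoding.

```python
def encode_field(value, size=None, encoding="utf-8"):
    """Encode `value`, optionally truncating to at most `size` bytes.

    Hypothetical stand-in for an encoding helper with an optional
    'size' argument; not the library's real function.
    """
    data = value.encode(encoding)
    if size is not None and len(data) > size:
        # Slicing may leave a partial character at the end. Decoding with
        # errors="ignore" discards those dangling bytes, and re-encoding
        # yields valid data no longer than `size` bytes.
        data = data[:size].decode(encoding, errors="ignore").encode(encoding)
    return data


print(encode_field("Hello1", size=10))
print(encode_field("中国的汉字2", size=10).decode("utf-8"))
```

Keeping the truncation next to the encoding step means callers only pass the field size, which matches the goal of avoiding a separate special method.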

Hope you can resubmit this for v2.0.

FourSpaces added 2 commits October 28, 2017 18:49

Compatible python3 with python2 encoding issues

6ab323e

karimbahgat added the enhancement label Jun 6, 2018

karimbahgat closed this Jun 6, 2018

karimbahgat mentioned this pull request Jun 6, 2018

Avoid cutting off utf8 encoding halfway when truncating text values #148

Open
